Parallelizing Compilation through Load-Time Scheduling for a Superscalar Processor Family
Authors
Abstract
Superscalar processors improve the execution time of sequential programs by exploiting instruction-level parallelism (ILP). The efficiency of parallelization at run-time can be increased through an additional scheduling phase in the compiler for a concrete target machine. If the target machine is not known at compile-time, however, scheduling must be deferred to a later phase immediately before program execution. In this paper we present a novel technique that prepares parallelization at compile-time and performs scheduling at load-time. Our approach, called CALS (Code Annotations for Load-time Scheduling), applies proof-carrying code techniques and a new algorithm to perform scheduling in linear time. Additionally, the closely related task of register allocation is split between compile-time and load-time. CALS achieves improvements of up to 23.8% over simple compilation without scheduling. Its results are comparable to those of conventional list scheduling, and in some cases it outperforms list scheduling by up to 12.4%.
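For orientation, the conventional list scheduling that the abstract uses as a baseline can be sketched as follows. This is a minimal illustration of the textbook technique for a single basic block, not the CALS algorithm itself. The function name, the dependence and latency tables, and the `issue_width` parameter are all assumptions made for the example.

```python
from collections import defaultdict


def list_schedule(instrs, deps, latency, issue_width):
    """Greedy cycle-by-cycle list scheduling of one basic block.

    instrs:      instruction names in original program order
    deps:        maps an instruction to the set of earlier instructions it needs
    latency:     maps an instruction to its result latency in cycles (>= 1)
    issue_width: max instructions issued per cycle (hypothetical machine model)
    Returns a dict mapping each instruction to its issue cycle.
    """
    # Build successor lists from the predecessor sets.
    succs = defaultdict(list)
    for i in instrs:
        for p in deps.get(i, ()):
            succs[p].append(i)

    # Priority = longest latency-weighted path to the end of the block.
    # Program order is a topological order, so walking it backwards
    # guarantees every successor already has a priority.
    prio = {}
    for i in reversed(instrs):
        prio[i] = latency[i] + max((prio[s] for s in succs[i]), default=0)

    issue = {}
    remaining = list(instrs)
    cycle = 0
    while remaining:
        # An instruction is ready once all its operands have been produced.
        ready = [i for i in remaining
                 if all(p in issue and issue[p] + latency[p] <= cycle
                        for p in deps.get(i, ()))]
        ready.sort(key=lambda i: -prio[i])  # stable: ties keep program order
        for i in ready[:issue_width]:
            issue[i] = cycle
            remaining.remove(i)
        cycle += 1
    return issue


# Hypothetical five-instruction block on a 2-wide machine.
instrs = ["load_a", "load_b", "add", "mul", "store"]
deps = {"add": {"load_a", "load_b"}, "mul": {"add"}, "store": {"mul"}}
latency = {"load_a": 2, "load_b": 2, "add": 1, "mul": 3, "store": 1}
schedule = list_schedule(instrs, deps, latency, issue_width=2)
```

Each cycle of this sketch rescans the remaining instructions, so it is not linear in the block size. The abstract reports that CALS avoids this kind of load-time cost by having the compiler prepare annotations in advance.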
Similar resources
SALT: Efficient Load-Time Scheduling for Superscalar Processor Families Using Compiler Annotations
Superscalar processors exploit instruction-level parallelism (ILP) by dispatching machine instructions to several functional units where they are executed in parallel. The efficiency of parallelization at run-time can be increased through an additional scheduling phase for a concrete target machine in the compiler. But if the mobile code should be executed in a heterogeneous network with process...
Effective Instruction Prefetching In Chip Multiprocessors
threaded application performance, often achieved through instruction level parallelism per chip is increasing, the software and hardware techniques to exploit the potential of studies mostly involve distributed shared memory multiprocessors and fetching will not be fully effective at masking the remote fetch latency. the effective address of the load instructions along that path based upon a hi...
Compilation Support for Superscalar Processors
This thesis describes work done in two areas of compilation support for superscalar processors: register allocation and instruction scheduling. Chapter 1 describes an approach to register allocation for superscalar processors that supports dynamic and speculative out-of-order execution of instructions and guarantees precise interrupts without expensive hardware for managing register usage and m...
Inter-block Scoreboard Scheduling in a JIT Compiler for VLIW Processors
We present a postpass instruction scheduling technique suitable for Just-In-Time (JIT) compilers targeted to VLIW processors. Its key features are: reduced compilation time and memory requirements; satisfaction of scheduling constraints along all program paths; and the ability to preserve existing prepass schedules, including software pipelines. This is achieved by combining two ideas: instruct...
Integrating Parallelizing Compilation Technology and Processor Architecture for Cost-Effective Concurrent multithreading
As the number of transistors on a single chip continues to grow, it is important to think beyond the traditional approaches of compiler optimizations for deeper pipelines and wider instruction issue units to improve performance. This single-threaded execution model limits these approaches to exploiting only the relatively small amount of instruction-level parallelism available in application pr...
Publication date: 2005